We present a new method for identifying blazar candidates by examining the locus, i.e. the region occupied by the Fermi γ-ray blazars in the three-dimensional color space defined by the WISE infrared colors. This method is a refinement of our previous approach, which made use of the two-dimensional projections of the distribution of WISE γ-ray emitting blazars (the Strip) in the three WISE color-color planes [Massaro et al. 2012a]. In this paper, we define the three-dimensional locus by means of a Principal Component (PC) analysis of the color distribution of a large sample of blazars composed of all the ROMA-BZCAT sources with counterparts in the WISE All-Sky Catalog and associated with a γ-ray source in the second Fermi LAT catalog (2FGL) (the WISE Fermi Blazars sample, WFB). Our new procedure, as reported in [D'Abrusco et al. 2013], yields a total completeness of $c_{tot}$ ~81% and a total efficiency of $e_{tot}$ ~97%.



1. Introduction 



Unveiling the nature of the Unidentified Gamma-ray Sources (UGSs) is one of the main scientific objectives of the ongoing Fermi γ-ray mission. Recently, several attempts have been made to associate or characterize the UGSs, either using X-ray observations [e.g., Mirabal 2009; Mirabal & Halpern 2009; et al. 2010] or with statistical approaches [e.g. Ackermann et al. 2012; Mirabal et al. 2010]. Nevertheless, according to [Nolan et al. 2012], 31% of the γ-ray sources in the second Fermi LAT catalog (2FGL) remain unidentified. Many of these unidentified sources could be blazars, since blazars are known to dominate the γ-ray sky [e.g. Abdo et al. 2010; Hartman et al. 1999], and among the 1297 associated sources within the 2FGL, 805 (62%) are known blazars [Nolan et al. 2012]. Therefore it is important to devise an efficient means of identifying candidate blazars among these sources.

Blazars come in two main classes: the BL Lac objects, which have featureless optical spectra, and the more luminous Flat-Spectrum Radio Quasars, which typically show prominent optical emission lines [Stickel et al. 1991]. In the following discussion, we label the BL Lac objects as BZBs and the Flat-Spectrum Radio Quasars as BZQs, following the nomenclature of the Multi-wavelength Catalog of blazars [ROMA-BZCAT, Massaro et al. 2009].



Using the preliminary data release of the Wide-field Infrared Survey Explorer (WISE) [see Wright for more details]¹, we showed that the γ-ray blazar population occupies a distinctive region of the WISE color parameter space (called the WISE Gamma-ray Strip) [D'Abrusco et al. 2012; Massaro et al.]. Taking advantage of the much larger data set now available thanks to the WISE All-Sky archive², released in March 2012, in this work we present a revised definition of the region occupied by the γ-ray blazars. We will refer to the three-dimensional region occupied by the γ-ray emitting blazars as the locus, while we will continue to refer to the two-dimensional projection of the locus in the [3.4]-[4.6] vs [4.6]-[12] μm color-color plane as the WISE Gamma-ray Strip. In this proceeding we determine a geometrical description of the WFB locus in the space generated by the Principal Components of their distribution in the WISE color space and apply our association procedure to the whole sample of blazars that belong to the 2FGL.



2. The WISE Fermi Blazars sample 

We found 3032 out of 3149 blazars (i.e., 96.3% of the ROMA-BZCAT) with an IR counterpart within 3.3" in the WISE All-Sky data archive. In this sample, there are only 2 multiple matches out of 3032 spatial associations, for which we used the IR data of the closest WISE source in the following analysis. The probability of chance association for these 3032 sources is ~3.3%, implying that ~100 sources associated within the above radius could be spurious associations.

¹ http://wise2.ipac.caltech.edu/docs/release/prelim/
² http://wise2.ipac.caltech.edu/docs/release/allsky/

Of these 3032 blazars, 1172 are BZBs (919 BL Lacs and 253 BL Lac candidates), 1642 are BZQs and 218 are BZUs (blazars of uncertain type). It is also worth noting that all the blazars associated between the ROMA-BZCAT and the WISE All-Sky data release are detected in the first two filters at 3.4 and 4.6 μm. Among the 3032 selected blazars, only 673 have a counterpart in the γ-rays according to the 2FGL and to the CLEAN sample presented in the second Fermi LAT Catalog of active galactic nuclei [2LAC; Ackermann et al. 2011]. Of these 673 blazars, 637 (i.e., 94.7%; 333 BZBs, 277 BZQs, and 27 BZUs) are detected in all four WISE bands. As in [Massaro et al. 2012b], the sample of γ-ray emitting blazars in the ROMA-BZCAT catalog was derived excluding the BZU sources from our sample of γ-ray loud blazars. For this reason, the final WFB sample includes only 610 WISE sources out of 673 WISE counterparts. We have used the WFB sample to characterize the model of the locus in the WISE color space.
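The bookkeeping behind the sample sizes quoted above can be checked with a few lines of arithmetic. This is only a sketch that re-derives the percentages from the counts given in the text; the variable names are illustrative and not taken from any released code.

```python
# Counts quoted in the text; variable names are illustrative only.
bzcat_total = 3149
wise_matched = 3032
by_class = {"BZB": 919 + 253, "BZQ": 1642, "BZU": 218}
gamma_matched = 673
gamma_four_bands = {"BZB": 333, "BZQ": 277, "BZU": 27}

# Fraction of ROMA-BZCAT sources with a WISE counterpart.
match_fraction = wise_matched / bzcat_total

# Expected number of spurious matches for a ~3.3% chance-association rate.
expected_spurious = 0.033 * wise_matched

# Fraction of gamma-ray blazars detected in all four WISE bands.
four_band_fraction = sum(gamma_four_bands.values()) / gamma_matched

# The WFB sample keeps only BZBs and BZQs (BZUs are excluded).
wfb_size = gamma_four_bands["BZB"] + gamma_four_bands["BZQ"]

print(f"matched: {100 * match_fraction:.1f}%")
print(f"expected spurious: {expected_spurious:.0f}")
print(f"four-band detections: {100 * four_band_fraction:.1f}%")
print(f"WFB sample size: {wfb_size}")
```

The check confirms that the 610 WFB sources are the 333 BZBs plus 277 BZQs detected in all four bands.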



3. The locus parametrization 




Figure 1: Scatterplot of the WFB sources in the three-dimensional WISE color space. The spectral class of the WFB sources (BZBs, BZQs) is color-coded, while the gray points represent the projections of the WFB sample in the three color planes generated by the WISE colors.



The new parametrization of the WFB locus is based on a new model of the locus in the PCs space generated by the WISE colors of the WFB WISE counterparts, and on a revised definition of the statistical quantity used to evaluate the compatibility of a generic WISE source with the locus model, the score. The distribution of the WFB sources in the three-dimensional WISE color space is axisymmetric along a slew line (see Figure 1), so that a simple geometrical description of the locus can be determined in the PCs space.

Principal Component Analysis (PCA) uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables, the PCs. This transformation $T : (c_1, c_2, c_3) \rightarrow (PC_1, PC_2, PC_3)$ is defined so that the first PC ($PC_1$) accounts for as much of the variance in the data as possible, and each following component ($PC_2$, $PC_3$, etc., up to the dimensionality of the initial space) has the highest variance possible under the constraint that it is orthogonal to the preceding components. In our case, the WFB sources in the three-dimensional PCs space based on their color distributions lie almost perfectly along the $PC_1$ axis and are distributed symmetrically in the $PC_2$ vs $PC_3$ plane around the $PC_1$ line. Based on the shape of the locus in the PCs space, we choose to define its geometrical model using a cylindrical parametrization, with the axis aligned along the $PC_1$ axis (see Figure 3). The locus, as a whole, is modeled by three distinct cylinders: the first two of these cylindrical regions are dominated by BZB and BZQ sources respectively, while the third cylinder is defined as the region where the WFB population is mixed in terms of spectral classes (the Mixed region, hereinafter).
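The transformation $T$ described above can be sketched with a standard eigen-decomposition of the color covariance matrix. The following is a minimal illustration on synthetic colors, not the pipeline used for the WFB sample: the function names `fit_pca` and `to_pc_space`, the mean-centering step, the absence of any rescaling of the colors, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def fit_pca(colors):
    """PCA of an (N, 3) array of colors.

    Returns (mean, components, variances); components[k] is the unit
    vector of the k-th PC and variances are sorted in decreasing order,
    so components[0] is the PC1 direction.
    """
    colors = np.asarray(colors, dtype=float)
    mean = colors.mean(axis=0)
    cov = np.cov(colors - mean, rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)   # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1]       # reorder to descending variance
    return mean, eigvec[:, order].T, eigval[order]

def to_pc_space(colors, mean, components):
    """Apply T: (c1, c2, c3) -> (PC1, PC2, PC3) to one or more sources."""
    return (np.asarray(colors, dtype=float) - mean) @ components.T

# Synthetic stand-in for the WFB colors: points scattered along a line.
rng = np.random.default_rng(0)
axis = np.array([0.3, 0.8, 0.5])
axis /= np.linalg.norm(axis)
t = rng.normal(0.0, 1.0, size=500)
colors = np.array([0.8, 2.5, 2.2]) + t[:, None] * axis \
    + rng.normal(0.0, 0.1, size=(500, 3))

mean, components, variances = fit_pca(colors)
pcs = to_pc_space(colors, mean, components)
```

In this toy data set the first component recovers the direction of the line, and the scatter in the other two components is small and symmetric, which is the configuration that motivates a cylindrical model.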

The upper and lower boundaries of the model along the $PC_1$ axis have been determined by requiring that 90% of the total number of WFB sources is contained within the boundaries of the cylinder, with 5% of the sources outside of the boundaries on each side of the model along the $PC_1$ axis. The boundaries of the Mixed section along the $PC_1$ axis have been defined by requiring that, in this region, the fraction of either spectral class is smaller than 80% of the total number of WFB sources. The three boundaries along the $PC_1$ axis defining the three sections of the WFB locus model are shown in Figure 2.
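The two boundary criteria can be sketched as follows. This is a simplified illustration under stated assumptions: the outer limits are taken as the 5th and 95th percentiles of the $PC_1$ values, and the 80% purity criterion is evaluated in fixed $PC_1$ bins, which is a choice made here for the example (the text does not say how the class fractions are sampled along $PC_1$). All function names are hypothetical.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a list of numbers."""
    data = sorted(values)
    pos = (len(data) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)

def pc1_limits(pc1, outside_each_side=5.0):
    """Outer PC1 limits leaving `outside_each_side` percent beyond each end."""
    return (percentile(pc1, outside_each_side),
            percentile(pc1, 100.0 - outside_each_side))

def mixed_bins(pc1, labels, edges, max_fraction=0.80):
    """Flag PC1 bins in which neither class reaches `max_fraction`.

    `labels` holds "BZB" or "BZQ" for each source; empty bins are not flagged.
    """
    flags = []
    for left, right in zip(edges[:-1], edges[1:]):
        in_bin = [lab for x, lab in zip(pc1, labels) if left <= x < right]
        if not in_bin:
            flags.append(False)
            continue
        frac_bzb = in_bin.count("BZB") / len(in_bin)
        flags.append(max(frac_bzb, 1.0 - frac_bzb) < max_fraction)
    return flags

# Toy example: the second bin contains a 40/60 mix and is flagged as mixed.
toy_pc1 = [0.1, 0.2, 0.3, 0.4, 1.1, 1.2, 1.3, 1.4, 1.5]
toy_labels = ["BZB"] * 4 + ["BZB", "BZB", "BZQ", "BZQ", "BZQ"]
toy_flags = mixed_bins(toy_pc1, toy_labels, edges=[0.0, 1.0, 2.0])
```

The contiguous run of flagged bins would then give the extent of the Mixed section, with the BZB-dominated and BZQ-dominated sections on either side.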

Figure 2: Boundaries of the three sections of the WFB locus in the PCs space along the $PC_1$ axis. The solid black line represents the "purity" of the WFB population, i.e. the fraction of the dominant spectral class relative to the other spectral class. The solid red and blue lines represent the fractions of BZQ and BZB sources, while the histogram in the background represents the normalized density of the distribution of the whole WFB sample along the $PC_1$ axis. The horizontal green line shows the threshold used to determine the boundaries of the Mixed region.

The variances of the WFB distribution in the PCs space along the second and third PCs are $\sigma_{PC_2} = 0.61$ and $\sigma_{PC_3} = 0.58$ respectively. Since these two values are similar, we have modeled the bases of the cylinders as circles centered on the axis of the first principal component $PC_1$ (the variance of the WFB distribution along $PC_1$ is $\sigma_{PC_1} = 1.53$). The radii of the circular bases of the three cylinders representing the three different sections of the WFB locus in the PCs space have been determined independently as the radii containing 90% of the WFB sources in each section. The radii of the three cylinders are defined in the plane generated by the $PC_2$ and $PC_3$ axes and evaluated as $R = \sqrt{PC_2^2 + PC_3^2}$.
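The radius of each cylinder can be sketched as the smallest radial distance from the $PC_1$ axis that encloses 90% of the sources in that section. The helper below is a hypothetical illustration; in particular, rounding the enclosed count up to the next integer is an assumption made here.

```python
import math

def section_radius(pc2, pc3, contained=0.90):
    """Smallest radius in the PC2-PC3 plane enclosing `contained` of the sources."""
    radii = sorted(math.hypot(a, b) for a, b in zip(pc2, pc3))
    k = math.ceil(contained * len(radii))  # number of sources to enclose
    return radii[k - 1]

# Toy section with ten sources at radial distances 5, 10, ..., 50.
toy_pc2 = [3.0 * i for i in range(1, 11)]
toy_pc3 = [4.0 * i for i in range(1, 11)]
toy_radius = section_radius(toy_pc2, toy_pc3)
```

Applied separately to the sources falling in the BZB, Mixed and BZQ sections, this yields three independent radii, as described above.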

3.1. The score 

The distance of a generic WISE source from the model of the WFB locus in the PCs space can be evaluated quantitatively using a numeric quantity that we call the score. A generic WISE source with colors $(c_1, c_2, c_3)$ can be projected onto the PCs space by applying the orthogonal transformation determined by the PCA performed on the WFB sample for the modeling of the WFB locus. Thus, the position of the generic WISE source in the PCs space is determined by the PC values $(PC_1, PC_2, PC_3) = T(c_1, c_2, c_3)$. To take into account the uncertainties on the values of the WISE colors, the standard deviations on each color are also projected onto the PCs space and are used to define the error bars on the position of the source in the PCs space: $(\pm\sigma_{PC_1}, \pm\sigma_{PC_2}, \pm\sigma_{PC_3}) = T(\pm\sigma_{c_1}, \pm\sigma_{c_2}, \pm\sigma_{c_3})$. We simply assume that the generic WISE source is represented in the PCs space by the ellipsoid generated by the segments with extremes $PC_i \pm \sigma_{PC_i}$, hereinafter the uncertainty ellipsoid. Each of the six points at the extremes of the axes of the uncertainty ellipsoid in the PCs space will be generically called an extremal point.

Figure 3: Schematic representation of the possible positions of the uncertainty ellipsoid of a generic WISE source in the PCs space relative to the cylindrical models of the WFB locus (see descriptions of the different cases in the text).

The possible positions of the uncertainty ellipsoid associated with a generic WISE source relative to each of the three cylinders of the locus model (schematically shown in Figure 3 for one two-dimensional section of the PCs space) fall in one of the following cases: six extremal points within a cylinder (point A in Figure 3); five extremal points within a cylinder (point B in Figure 3); three extremal points within a cylinder (point C in Figure 3); one extremal point within a cylinder (points D in Figure 3); no extremal points within any cylinder (points F in Figure 3). Other combinations are not possible because the axes of the uncertainty ellipsoids are either parallel or orthogonal to the $PC_1$ axis of the PCs space. Points with any number of extremal points within two cylinders (like point E in Figure 3) are assigned a distinct score value for either cylinder according to the number of extremal points contained in each one.

The score $s$ for a generic WISE source with $n$ extremal points contained in one of the three sections of the WFB locus model is defined as:



where $\phi$ is the index of the score assignment law. This is a simple generalization of the most natural choice, which would assign 1/6 to each extremal point within the locus model, defining the total score of a source as linearly proportional to the number of extremal points within the model cylinders. This behavior is obtained in the general equation when $\phi = 1$. Changing the value of $\phi$ is useful to tune the performance of the association procedure in terms of the purity and completeness of the final sample of candidate blazars.
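Counting the extremal points inside one cylinder can be sketched as below. Two caveats apply. First, the functional form of the score assignment law is not reproduced in this text, so the power law `(n / 6) ** phi` used here is purely an assumption, chosen only because it reduces to the linear rule n/6 when phi = 1. Second, the cylinder is described by a hypothetical tuple `(pc1_min, pc1_max, radius)`; none of these names come from the original work.

```python
import math

def extremal_points(pc, sigma):
    """Six end points of the axis-aligned uncertainty ellipsoid in PC space."""
    points = []
    for axis in range(3):
        for sign in (-1.0, 1.0):
            p = list(pc)
            p[axis] += sign * sigma[axis]
            points.append(tuple(p))
    return points

def inside_cylinder(point, pc1_min, pc1_max, radius):
    """True if the point lies within a cylinder whose axis is the PC1 axis."""
    pc1, pc2, pc3 = point
    return pc1_min <= pc1 <= pc1_max and math.hypot(pc2, pc3) <= radius

def score(pc, sigma, cylinder, phi=1.0):
    """Return (n, s) for one cylinder.

    ASSUMPTION: s = (n / 6) ** phi; only the phi = 1 case (s = n / 6)
    is stated explicitly in the text.
    """
    n = sum(inside_cylinder(p, *cylinder) for p in extremal_points(pc, sigma))
    return n, (n / 6.0) ** phi

toy_cylinder = (-1.0, 1.0, 0.5)   # (pc1_min, pc1_max, radius), illustrative
n_inner, s_inner = score((0.0, 0.0, 0.0), (0.1, 0.1, 0.1), toy_cylinder)
n_edge, s_edge = score((0.95, 0.0, 0.0), (0.1, 0.1, 0.1), toy_cylinder)
```

A source well inside the cylinder keeps all six extremal points, while a source close to one end of the cylinder loses the extremal point that sticks out along $PC_1$. Evaluating `score` once per cylinder gives the separate values for the BZB, Mixed and BZQ sections.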
So far, the score assigned to a generic WISE source can take one of six different values determined by the score assignment law in Equation 1. To penalize the WISE sources with large uncertainties on the observed colors (and, in turn, a large volume of the uncertainty ellipsoid in the PCs space) relative to other WISE sources with the same number of extremal points contained in the locus model but smaller errors, we multiply the score obtained using Equation 1 by the ratio of the absolute values of the logarithms of the volume of the uncertainty ellipsoid of the source considered and of the volume of the largest uncertainty ellipsoid for WFB sources. Thus, for each of the three regions of the locus model, the weighted score is defined as:

$$ s_w = s \, \frac{\| \log V \|}{\| \log(\max(V_{WFB})) \|} $$

where $V_{WFB}$ are the volumes of the uncertainty ellipsoids of the WFB sources in the PCs space, calculated as $V_{WFB} = \frac{4}{3} \pi \sigma_{PC_1} \sigma_{PC_2} \sigma_{PC_3}$, and $V$ is the volume of the uncertainty ellipsoid in the PCs space of the generic WISE source considered. The logarithms of the volumes of the uncertainty ellipsoids are used to take into account the large number of orders of magnitude potentially spanned by the differences between the volumes (which are always smaller than one in the PCs space). The above definition of the weighted score also has the effect of mapping the discrete distribution of scores calculated according to the assignment law in Equation 1 into a continuous distribution that allows a finer classification of the candidate blazars.
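The weighting step can be sketched in a few lines. The function names are illustrative. Since the weight is a ratio of two logarithms, the base of the logarithm cancels, so the natural logarithm is used here without loss of generality.

```python
import math

def ellipsoid_volume(sigma_pc):
    """Volume of an ellipsoid with semi-axes (sigma_PC1, sigma_PC2, sigma_PC3)."""
    s1, s2, s3 = sigma_pc
    return 4.0 / 3.0 * math.pi * s1 * s2 * s3

def weighted_score(s, sigma_pc, max_wfb_volume):
    """Scale the score s by |log V| / |log max(V_WFB)|."""
    v = ellipsoid_volume(sigma_pc)
    return s * abs(math.log(v)) / abs(math.log(max_wfb_volume))

# Toy values: the largest WFB ellipsoid has semi-axes of 0.3 in each PC.
v_max = ellipsoid_volume((0.3, 0.3, 0.3))
s_large_errors = weighted_score(0.5, (0.3, 0.3, 0.3), v_max)
s_small_errors = weighted_score(0.5, (0.03, 0.03, 0.03), v_max)
```

For volumes smaller than one, a source with smaller color uncertainties has a larger |log V| and therefore a larger weighted score than a source with the same number of extremal points but larger uncertainties.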



4. Selection of the candidate blazars 

The procedure for the evaluation of the scores based on the new parametrization of the WFB locus discussed in the previous section is used to associate high-energy sources with WISE candidate blazars. The WISE colors and their uncertainties for all the sources found in the WISE All-Sky photometry catalog within the region of positional uncertainty (hereinafter the Search Region, SR) of a given high-energy source and detected in all four WISE filters are retrieved, and the scores of these WISE sources are calculated as described in Section 3.1. Then, these sources are split among different classes according to the values of their scores $s_b$, $s_m$ and $s_q$ for the BZB, Mixed and BZQ regions of the WFB locus model in the PCs space respectively. For each locus region, every source is assigned to class A, class B or class C, or is marked as an outlier, based on its score values relative to the threshold score values defined as the 30th, 60th and 90th percentiles of the distributions of scores in the three regions of the locus for the WFB sources (see Figure 4).

Figure 4: Histograms of the distributions of score values calculated for the sources in the WFB sample for the three regions of the locus dominated respectively by the BZQs, the BZBs and in the Mixed region (upper, middle and lower panels respectively). The three vertical lines in each panel represent $s_{30\%}$, $s_{60\%}$ and $s_{90\%}$. These thresholds have been used to define the classes of candidate blazars (see text).

Table 1: Values of the score thresholds $s_{30\%}$, $s_{60\%}$ and $s_{90\%}$ used for the association experiments described in this proceeding. These values are determined as the 30th, 60th and 90th percentiles of the scores of the WFB sample divided by BZB, Mixed and BZQ regions.

             BZB     Mixed   BZQ
$s_{30\%}$   0.48    0.44    0.41
$s_{60\%}$   0.75    0.79    0.79
$s_{90\%}$   0.93    0.92    0.94

The classes are sorted according to decreasing probability of the WISE source being compatible with the model of the WFB locus: class A sources are considered the most probable candidate blazars for the high-energy source in the SR, while class B and class C sources are less compatible with the WFB locus but are still deemed candidate blazars. In more detail, class A candidate blazars have score $s \geq s_{90\%}$, class B candidate blazars have score $s_{60\%} \leq s < s_{90\%}$ and class C candidate blazars have score $s_{30\%} \leq s < s_{60\%}$ for each region. All other sources are considered outliers and are discarded. The values of the score thresholds derived from the score distributions of WFB sources for the three regions of the locus model are reported in Table 1 and shown in Figure 4, overplotted on the histograms of the score distributions of the WFB sources assigned to each of the three locus regions.

eConf C121028

4th Fermi Symposium : Monterey, CA : 28 Oct - 2 Nov 2012

The choice of the percentiles used to define the classes of candidate blazars is arbitrary and can be changed to allow for more conservative (higher purity of the sample of candidates) or more complete (lower purity of the sample of candidates) selections of candidate blazars in the SRs associated with unidentified high-energy sources.
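The class assignment can be sketched with the thresholds listed in Table 1. In this illustration class A is taken to be the scores at or above the 90th-percentile threshold, with classes B and C covering the two intervals below it; the treatment of scores exactly equal to a threshold is an assumption, and the names `THRESHOLDS` and `classify` are hypothetical.

```python
# Score thresholds from Table 1, per locus region.
THRESHOLDS = {
    "BZB":   {"s30": 0.48, "s60": 0.75, "s90": 0.93},
    "Mixed": {"s30": 0.44, "s60": 0.79, "s90": 0.92},
    "BZQ":   {"s30": 0.41, "s60": 0.79, "s90": 0.94},
}

def classify(score, region):
    """Return 'A', 'B', 'C' or 'outlier' for a score in a given locus region."""
    t = THRESHOLDS[region]
    if score >= t["s90"]:
        return "A"
    if score >= t["s60"]:
        return "B"
    if score >= t["s30"]:
        return "C"
    return "outlier"

example = {region: classify(0.80, region) for region in THRESHOLDS}
```

Replacing the values in `THRESHOLDS` with other percentiles of the WFB score distributions is how a more conservative or a more complete selection would be obtained.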

4.1. Associations 

In our association procedure, the presence of WISE background sources with score values that would qualify them as candidate blazars but that are not located within the SR of the unidentified high-energy source is taken into account by assessing the number and type of spurious associations from sources within a local background region for each unassociated source. For a generic SR of radius $r_{SR}$, we define the background region (BR) as an annulus of outer radius $r_{BR} = \sqrt{2}\, r_{SR}$ and inner radius equal to the SR radius, centered on the center of the SR. The SR and BR have the same area by definition. Within a given BR, all WISE sources detected in all four WISE filters are assigned a score value for each region of the locus model, and subsequently ranked in classes using the same thresholds used to classify the sources within the SR. An example of a generic SR and associated background region is shown in Figure 5, where the candidate blazar and the spurious BR candidate blazar are colored according to their class membership as defined in Section 4.

For every unassociated high-energy source, our method returns all candidate blazars (sources classified as class A, class B or class C candidates) in the SR. All candidate blazars located in the BR of the high-energy source are also provided and can be used to evaluate the chance of spurious associations as a function of the class of the candidate blazars.
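The geometry of the search and background regions can be sketched as follows. The example assumes small angular offsets, so that areas can be computed in the flat-sky approximation, and it assumes that each candidate is described by a hypothetical `(offset, class)` pair, where the offset is the angular distance from the SR center in the same units as the SR radius.

```python
import math

def background_outer_radius(r_sr):
    """Outer radius of the BR annulus, chosen so that BR and SR areas match."""
    return math.sqrt(2.0) * r_sr

def region_of(offset, r_sr):
    """Return 'SR', 'BR' or None for a source at the given offset."""
    if offset <= r_sr:
        return "SR"
    if offset <= background_outer_radius(r_sr):
        return "BR"
    return None

def count_candidates(candidates, r_sr):
    """Count candidate blazars per region and class."""
    counts = {"SR": {}, "BR": {}}
    for offset, cls in candidates:
        region = region_of(offset, r_sr)
        if region is not None:
            counts[region][cls] = counts[region].get(cls, 0) + 1
    return counts

r_sr = 1.0
sr_area = math.pi * r_sr ** 2
br_area = math.pi * (background_outer_radius(r_sr) ** 2 - r_sr ** 2)
toy_counts = count_candidates([(0.2, "A"), (1.1, "C"), (1.3, "B"), (1.6, "A")], r_sr)
```

Because the two regions have the same area, the per-class counts in the BR give a direct local estimate of how many candidates of each class would be expected in the SR by chance.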



5. Conclusions 

In this proceeding we have described the WFB sample of γ-ray emitting WISE blazars, gathered using the new WISE All-Sky release, the 2FGL catalog and the latest release of the ROMA-BZCAT catalog. Then, we have presented a new association procedure for the unidentified high-energy sources based on a new model of the locus occupied by the WFB sample in the three-dimensional PCs space generated by the distribution of WFB WISE sources in the WISE color space. We defined a quantitative measure of the compatibility of a generic WISE source with the locus model and described the new association procedure. This method can select candidate blazars classified as BZB or BZQ candidates and ranked according to the likelihood of each candidate being an actual blazar. We also investigated the possibility of spurious associations by determining the number and class of WISE sources compatible with the model of the WFB locus in background regions defined around the SR of each high-energy source.

Figure 5: Results of the association procedure for a generic unassociated high-energy source superimposed on the image of the WISE sky around the position of the unassociated γ-ray source as seen in the 3.4 μm band. The inner circle represents the Search Region (SR) of the high-energy source, while the outer circle delimits the annulus used as Background Region (BR). The open circles in the SR represent the sources of the WISE All-Sky catalog detected in all four WISE filters for which the scores have been evaluated (the sources not marked by symbols in the image are not detected in at least one of the four WISE filters and have not been considered for the score evaluation). The solid circle represents the candidate blazar found within the SR, and its color indicates that it is a class A candidate blazar.

The performance of the method in terms of efficiency and completeness has been estimated in [D'Abrusco et al. 2013], yielding a total efficiency $e_{tot}$ ~97% and a total completeness $c_{tot}$ ~81% respectively. By using a K-fold cross-validation approach, we have also estimated the efficiency and completeness as functions of the WISE colors and galactic coordinates of the candidate blazars.

In [D'Abrusco et al. 2013] we have presented the catalog of candidate blazars associated with the new procedure to the 2FGL γ-ray sources included in the WFB sample, used to define the new model of the locus. We have also discussed the catalog of candidate blazars obtained by applying the new association procedure to the 2FB sample, composed of all clean γ-ray sources associated with blazars in the 2FGL catalog but not contained in the WFB sample. We will make the code for the association and both catalogs of candidate blazars publicly available.